Stuart Russell

Berkeley

Transformers Provably Learn to Internalize Chain-of-Thought

May 27, 2026
Via arXiv

Learning the Preferences of a Learning Agent

May 09, 2026
Via arXiv

Robust and Diverse Multi-Agent Learning via Rational Policy Gradient

Nov 12, 2025
Via arXiv

GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments

Sep 26, 2025
Via arXiv

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025
Via arXiv

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Jun 12, 2025
Via arXiv

Provable Sim-to-Real Transfer via Offline Domain Randomization

Jun 11, 2025
Via arXiv

Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

May 18, 2025
Via arXiv

AssistanceZero: Scalably Solving Assistance Games

Apr 09, 2025
Via arXiv

How Do LLMs Perform Two-Hop Reasoning in Context?

Feb 19, 2025
Via arXiv